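# Load packages: tidyverse for data handling, skimr and codebook for summaries.
library(tidyverse)
library(skimr)
library(codebook)
# Read the heart disease CSV from GitHub.
heartdata <- read.csv("https://raw.githubusercontent.com/Maz-SM-22/Neural-Network-Classifier/master/heart_statlog_cleveland_hungary_final.csv")
# Save a local copy as an .RData file.
save(heartdata, file = "./heartdata.RData")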

Heart Disease Dataset Codebook

codebook(heartdata)
## No missing values.

Metadata

Description

Dataset name: heartdata

The dataset has N=1190 rows and 12 columns. 1190 rows have no missing values on any column.
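The row, column, and complete-case counts reported above can be checked directly in base R. The following is a minimal sketch, not part of the original analysis: `describe_completeness` is a hypothetical helper, and `toy` is a made-up data frame standing in for `heartdata` so that the example runs without a download.

```r
# Hypothetical helper: report the size and completeness of a data frame.
describe_completeness <- function(df) {
  list(
    n_rows = nrow(df),                          # number of observations
    n_cols = ncol(df),                          # number of variables
    n_complete_rows = sum(complete.cases(df))   # rows with no NA in any column
  )
}

# Toy data standing in for heartdata (values are made up for illustration).
toy <- data.frame(
  age = c(40, 49, NA, 54),
  sex = c(1, 0, 1, 1)
)

res <- describe_completeness(toy)
print(res)
```

Calling `describe_completeness(heartdata)` after loading the data should give the three counts quoted in the description.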

Metadata for search engines
  • Date published: 2021-04-01
  • Keywords: age, sex, chest.pain.type, resting.bp.s, cholesterol, fasting.blood.sugar, resting.ecg, max.heart.rate, exercise.angina, oldpeak, ST.slope, target

Variables

age

Distribution

Distribution of values for age

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
age numeric 0 1 28 54 77 53.72017 9.358203 ▁▅▇▇▁ NA

sex

Distribution

Distribution of values for sex

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
sex numeric 0 1 0 1 1 0.7638655 0.4248843 ▂▁▁▁▇ NA

chest.pain.type

Distribution

Distribution of values for chest.pain.type

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
chest.pain.type numeric 0 1 1 4 4 3.232773 0.9354804 ▁▃▁▃▇ NA

resting.bp.s

Distribution

Distribution of values for resting.bp.s

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
resting.bp.s numeric 0 1 0 130 200 132.1538 18.36882 ▁▁▅▇▁ NA

cholesterol

Distribution

Distribution of values for cholesterol

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
cholesterol numeric 0 1 0 229 603 210.3639 101.4205 ▃▇▇▁▁ NA

fasting.blood.sugar

Distribution

Distribution of values for fasting.blood.sugar

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
fasting.blood.sugar numeric 0 1 0 0 1 0.2134454 0.4099118 ▇▁▁▁▂ NA

resting.ecg

Distribution

Distribution of values for resting.ecg

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
resting.ecg numeric 0 1 0 0 2 0.6983193 0.8703588 ▇▁▂▁▃ NA

max.heart.rate

Distribution

Distribution of values for max.heart.rate

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
max.heart.rate numeric 0 1 60 140 202 139.7328 25.51764 ▁▃▇▇▂ NA

exercise.angina

Distribution

Distribution of values for exercise.angina

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
exercise.angina numeric 0 1 0 0 1 0.387395 0.4873599 ▇▁▁▁▅ NA

oldpeak

Distribution

Distribution of values for oldpeak

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
oldpeak numeric 0 1 -2.6 0.6 6.2 0.9227731 1.086337 ▁▇▆▁▁ NA

ST.slope

Distribution

Distribution of values for ST.slope

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
ST.slope numeric 0 1 0 2 3 1.62437 0.6104592 ▁▇▁▇▁ NA

target

Distribution

Distribution of values for target

0 missing values.

Summary statistics

name data_type n_missing complete_rate min median max mean sd hist label
target numeric 0 1 0 1 1 0.5285714 0.4993929 ▇▁▁▁▇ NA

Missingness report

Codebook table

JSON-LD metadata

The following JSON-LD can be found by search engines, if you share this codebook publicly on the web.

{
  "name": "heartdata",
  "datePublished": "2021-04-01",
  "description": "The dataset has N=1190 rows and 12 columns.\n1190 rows have no missing values on any column.\n\n\n## Table of variables\nThis table contains variable names, labels, and number of missing values.\nSee the complete codebook for more.\n\n|name                |label | n_missing|\n|:-------------------|:-----|---------:|\n|age                 |NA    |         0|\n|sex                 |NA    |         0|\n|chest.pain.type     |NA    |         0|\n|resting.bp.s        |NA    |         0|\n|cholesterol         |NA    |         0|\n|fasting.blood.sugar |NA    |         0|\n|resting.ecg         |NA    |         0|\n|max.heart.rate      |NA    |         0|\n|exercise.angina     |NA    |         0|\n|oldpeak             |NA    |         0|\n|ST.slope            |NA    |         0|\n|target              |NA    |         0|\n\n### Note\nThis dataset was automatically described using the [codebook R package](https://rubenarslan.github.io/codebook/) (version 0.9.2).",
  "keywords": ["age", "sex", "chest.pain.type", "resting.bp.s", "cholesterol", "fasting.blood.sugar", "resting.ecg", "max.heart.rate", "exercise.angina", "oldpeak", "ST.slope", "target"],
  "@context": "http://schema.org/",
  "@type": "Dataset",
  "variableMeasured": [
    {
      "name": "age",
      "@type": "propertyValue"
    },
    {
      "name": "sex",
      "@type": "propertyValue"
    },
    {
      "name": "chest.pain.type",
      "@type": "propertyValue"
    },
    {
      "name": "resting.bp.s",
      "@type": "propertyValue"
    },
    {
      "name": "cholesterol",
      "@type": "propertyValue"
    },
    {
      "name": "fasting.blood.sugar",
      "@type": "propertyValue"
    },
    {
      "name": "resting.ecg",
      "@type": "propertyValue"
    },
    {
      "name": "max.heart.rate",
      "@type": "propertyValue"
    },
    {
      "name": "exercise.angina",
      "@type": "propertyValue"
    },
    {
      "name": "oldpeak",
      "@type": "propertyValue"
    },
    {
      "name": "ST.slope",
      "@type": "propertyValue"
    },
    {
      "name": "target",
      "@type": "propertyValue"
    }
  ]
}
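Every variable in the table of variables above has the label NA, which is what one would expect for columns read with `read.csv`, since they carry no `label` attribute. A minimal sketch of attaching labels in base R follows. The data frame `toy` is made up, and the label texts are assumptions for illustration only, not taken from the dataset's documentation.

```r
# Toy data standing in for heartdata (values are made up for illustration).
toy <- data.frame(age = c(40, 49, 37), target = c(0, 1, 0))

# Attach a "label" attribute to each column.
# The label texts below are assumptions, not documented meanings.
attr(toy$age, "label") <- "Age of the patient (assumed to be in years)"
attr(toy$target, "label") <- "Outcome indicator (coding assumed)"

# Collect the labels per column; an unlabelled column would yield NA.
labels <- vapply(
  toy,
  function(col) {
    lab <- attr(col, "label")
    if (is.null(lab)) NA_character_ else lab
  },
  character(1)
)
print(labels)
```

With labels attached to `heartdata` in the same way before calling `codebook(heartdata)`, the label column should no longer be empty.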

Skim Heart Disease Dataset Summary

skim(heartdata)
Data summary
Name heartdata
Number of rows 1190
Number of columns 12
_______________________
Column type frequency:
numeric 12
________________________
Group variables None

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
age 0 1 53.72 9.36 28.0 47 54.0 60.00 77.0 ▁▅▇▇▁
sex 0 1 0.76 0.42 0.0 1 1.0 1.00 1.0 ▂▁▁▁▇
chest.pain.type 0 1 3.23 0.94 1.0 3 4.0 4.00 4.0 ▁▃▁▃▇
resting.bp.s 0 1 132.15 18.37 0.0 120 130.0 140.00 200.0 ▁▁▅▇▁
cholesterol 0 1 210.36 101.42 0.0 188 229.0 269.75 603.0 ▃▇▇▁▁
fasting.blood.sugar 0 1 0.21 0.41 0.0 0 0.0 0.00 1.0 ▇▁▁▁▂
resting.ecg 0 1 0.70 0.87 0.0 0 0.0 2.00 2.0 ▇▁▂▁▃
max.heart.rate 0 1 139.73 25.52 60.0 121 140.5 160.00 202.0 ▁▃▇▇▂
exercise.angina 0 1 0.39 0.49 0.0 0 0.0 1.00 1.0 ▇▁▁▁▅
oldpeak 0 1 0.92 1.09 -2.6 0 0.6 1.60 6.2 ▁▇▆▁▁
ST.slope 0 1 1.62 0.61 0.0 1 2.0 2.00 3.0 ▁▇▁▇▁
target 0 1 0.53 0.50 0.0 0 1.0 1.00 1.0 ▇▁▁▁▇
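The columns p0, p25, p50, p75, and p100 in the table above appear to correspond to the minimum, quartiles, and maximum of each variable. As a rough sketch of how such values can be reproduced with base R's `quantile()`, the example below uses a made-up vector `x` rather than a column of `heartdata`, and renames the result to match the skim column names.

```r
# Made-up values for illustration; not taken from heartdata.
x <- c(28, 47, 54, 60, 77)

# Quantiles at 0%, 25%, 50%, 75%, and 100% (default interpolation, type 7).
q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)

# Rename to mirror the skim column names.
names(q) <- c("p0", "p25", "p50", "p75", "p100")
print(q)
```

Applying the same call to a column such as `heartdata$age` should give values comparable to the corresponding row of the skim table.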